ABSTRACT



A method of visually identifying speaking participants in a 
multi-participant event such as an audio conference or an 
on-line game includes the step of receiving packets of 
digitized sound from a network connection. The identity of 
the participant associated with each packet is used to route 
the packet to a channel buffer or an overflow buffer. Each 
channel buffer may be assigned to a single participant in the
multi-participant event. A visual identifier module updates the
visual identifier associated with participants that have been 
assigned a channel buffer. In some embodiments, the appear- 
ance of the visual identifier associated with the participant is 
dependent upon the differential of an acoustic parameter 
derived from content in the associated channel buffer and a
reference value stored in a participant record. 

25 Claims, 7 Drawing Sheets 
SYSTEM AND METHOD FOR VISUALLY IDENTIFYING SPEAKING PARTICIPANTS
IN A MULTI-PARTICIPANT NETWORKED EVENT

CROSS-REFERENCE TO RELATED DOCUMENTS

The present invention is related to the subject matter
disclosed in U.S. patent application Ser. No. 09/358,877
("Apparatus and Method for Creating Audio Forums") filed
Jul. 22, 1999 and U.S. patent application Ser. No. 09/358,878
("Apparatus and Method for Establishing An Audio
Conference in a Networked Environment") filed Jul. 22,
1999. The present invention is also related to the subject
matter disclosed in U.S. Pat. No. 5,764,900 ("System and
Method for Communicating Digitally-Encoded Acoustic
Information Across a Network between Computers"). These
related documents are commonly assigned and hereby
incorporated by reference.

This application claims priority to the provisional patent
application entitled "System and Method For Visually
Identifying Speaking Participants In a Multi-Participant
Networked Event," Ser. No. 60/113,644, filed Dec. 23, 1998.

BRIEF DESCRIPTION OF THE INVENTION

The present invention discloses an apparatus and method
for identifying which participants in a multi-participant
event are speaking. Exemplary multi-participant events
include audio conferences and on-line games.

BACKGROUND OF THE INVENTION

Historically, multi-participant events such as multi-party
conferences have been hosted using Public Switched
Telephone Networks (PSTNs) and/or commercial wireless
networks. Although such networks allow multiple participants
to speak at once, they are unsatisfactory because they
provide no means for visually identifying each participant in
the event. More recently, teleconferencing systems that rely
on Internet Protocol based networks have been introduced.
Such systems, which enable two or more persons to speak to
each other using the Internet, are often referred to as
"Internet telephony."

Multi-participant events include audio conferences and
on-line games. Such events typically rely on the conversion
of analog speech to digitized speech. The digitized speech is
routed to all other participants across a network using the
Internet Protocol ("IP") and "voice over IP" or "VOIP"
technologies. Accordingly, each participant to the
multi-participant event has a client computer. When a
participant speaks, the speech is digitized and broken down
into packets that may be transferred to other participants
using a protocol such as IP, transmission control protocol
(TCP), or user datagram protocol (UDP). See, for example,
Peterson & Davie, Computer Networks, 1996, Morgan Kaufmann
Publishers, Inc., San Francisco, Calif.

While prior art Internet telephony is adequate for limited
purposes, such as a basic two-party conference call in which
only one participant speaks at any given time, prior art
telephony systems are unsatisfactory. First, they frequently
do not permit multiple participants to speak at the same time
without data loss. That is, if one participant speaks, the
participant typically cannot hear what other people said
while the participant was speaking. Second, prior art
telephony does not adequately associate a visual identifier
with each participant. Therefore, when a multi-participant
event includes several participants, they have difficulty
determining who is speaking. Some Internet telephony systems
have attempted to remedy this deficiency by (i) requiring that
only one speaker talk at any given time and/or by (ii) posting,
on each client associated with a participant in the
multi-participant event, the icon of the current speaker.
However, such solutions to the problem in the art are
unsatisfactory because effective multi-participant
communication requires that there be an ability for multiple
people to speak simultaneously. Thus, the concept of waiting
in line for a chance to speak is not a satisfactory solution to
the problem in the art.

A third drawback with prior art Internet telephony systems is
that they provide no mechanism for associating the
characteristics of a participant with a visual identifier that
is displayed on the client associated with each participant in
the multi-participant event. Such characteristics could be, for
example, a visual representation of how loudly a particular
speaker is speaking relative to some historical base state
associated with the participant. A fourth drawback of prior
art Internet telephony is that it does not provide an adequate
privilege hierarchy for dictating who may participate in a
particular multi-participant event. For example, in typical
prior art systems, there is no privilege hierarchy and any user
[illegible] may join the multi-participant event. Such
multi-participant events can be designated as "public
forums." While public forums serve a limited purpose, they
suffer from the drawback that there is no protection against
hecklers or otherwise disruptive participants in the event. To
summarize this point, prior art systems are unsatisfactory
because they do not provide a set of hierarchical privileges
that are associated with a participant and that allow
participants to designate events as private, public, or
moderated. As used in this context, private events include
conference calls in which the participants are preselected,
typically by each other. Other users of a system may not join
the event unless invited by one of the existing participants.
Public events are those in which anyone can join and speak at
any time. Moderated events may be public or private, but
require that at least one participant be given enhanced
privileges, such as the privilege to exclude particular
participants, invite participants, or grant and deny speaking
privileges to participants.

What is needed in the art is an Internet telephony system
and method that provides the tools necessary to conduct an
effective multi-participant event. Such a system should not
have limitations on the number of participants that can
concurrently speak. Further, such a system should provide
an adequate way of identifying the participants in the
multi-participant event.

SUMMARY OF THE INVENTION

The system and method of the present invention addresses
the need in the art by providing an Internet telephony system
and method that visually identifies the participants in a
multi-participant event. In the present invention, there is no
limitation on the number of participants that may
concurrently speak. Each participant in a multi-participant
event is associated with a visual identifier. The visual
identifier of each participant is displayed on the client
display screen of the respective participants in the
multi-participant event. In one embodiment, at least one
characteristic of the participant is reflected in the visual
identifier associated with the participant. Further, the system
and method of the present invention addresses the unmet need
in the art by providing participants with the flexibility to
assign a privilege hierarchy. Using this privilege hierarchy,
events may be designated as public, private, or moderated and
selected participants may be granted moderation privileges.

A system in accordance with one aspect of the present
invention includes a participant data structure comprising a
plurality of participant records. Each participant record is
associated with a different participant in a multi-participant
event. Multi-participant events of the present invention
include audio conferences and on-line games. Further, systems
in accordance with one aspect of the present invention
include an application module, which provides a user
interface to the multi-participant event, and a sound control
module that is capable of receiving packets from a network
connection. Each packet is associated with a participant in
the multi-participant event and includes digitized speech
from the participant. The sound controller has a set of
buffers. Each buffer preferably manages packets as a first-in
first-out queue. The sound controller further includes a
packet controller that determines which participant is
associated with each packet that has been received by the
network connection. The sound controller routes the packet
to a buffer based on the identity of the participant associated
with the packet. The sound controller also includes a visual
identification module for determining which participants in
the multi-participant event are speaking. The visual
identification module updates the visual identifier associated
with each participant that is speaking to reflect the fact that
they are speaking. Further, the visual identification module
updates the visual identifier associated with each participant
that is not speaking to reflect the fact that they are not
speaking. Finally, systems in accordance with a preferred
embodiment of the present invention include a sound mixer
for mixing digitized speech from at least one of the buffers
to produce a mixed signal that is presented to an output
device.

In some embodiments of the present invention, the
participant record associated with each participant includes a
reference speech amplitude. In such embodiments, the visual
identification module determines a buffered speech amplitude
based upon a characteristic of digitized speech in at
least one packet, associated with said participant, that is
managed by a buffer and computes a speech amplitude
differential based on (i) the buffered speech amplitude and
(ii) the reference speech amplitude stored in the participant
record. The visual identifier associated with the participant
is updated based on this speech amplitude differential.
Further, the buffered speech amplitude is saved as a new
reference speech amplitude in the participant record
associated with the participant.

In a method in accordance with the present invention,
packets are received from a remote source and an identity
associated with each packet is determined. In one
embodiment, this identity does not disclose the true identity
of the participant. In such embodiments, the identity could
be a random number assigned to the participant for the
duration of the multi-participant event. The identity of each
packet received from the remote source is compared with an
identity associated with a channel buffer. The identity
associated with each channel buffer is determined by an identity
of a packet stored by the channel buffer. In one embodiment,
a channel buffer is reserved for a single participant in the
multi-participant event at any given time. When the channel
buffer is not storing a packet, the channel buffer is not
associated with a participant and is considered "available."
Packets are routed based on the following rules:

(i) to a channel buffer when the identity of the packet
matches the identity associated with a channel buffer;

(ii) to an available channel buffer when the identity of the
packet does not match an identity of a channel buffer;
and

(iii) to an overflow buffer when the identity of the packet
does not match an identity of a channel buffer and there
is no available channel buffer.

A different visual identifier is associated with each
participant in the multi-participant event. In some embodiments
of the present invention, the appearance of the visual
identifier is determined by whether the identity of the
participant associated with the visual identifier matches an
identity associated with a channel buffer. In other embodiments
of the present invention, the appearance of the visual
identifier is determined by the difference between an acoustic
parameter derived from digitized speech in a channel buffer
associated with the participant and a reference acoustic
parameter stored in a participant record.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the invention, reference
should be made to the following detailed description taken
in conjunction with the accompanying drawings, in which:

FIG. 1 illustrates a system for identifying which
participants in a multi-participant event are speaking in
accordance with one embodiment of the invention.

FIG. 2 illustrates a participant data structure in accordance
with one embodiment of the invention.

FIG. 3 illustrates a flow diagram of the processing steps
associated with updating the visual identifier associated with
a participant on a client computer in accordance with one
embodiment of the invention.

FIG. 4 is a more detailed view of how a sound control
module interfaces with components of memory in a
client computer in accordance with one embodiment of the
invention.

FIG. 5 illustrates the structure of a channel buffer in one
embodiment of the invention.

FIG. 6 illustrates the processing steps associated with
identifying which participants in a multi-participant event
are speaking in accordance with one embodiment of the
present invention.

FIGS. 7a-7c are an illustration of the visual
identifiers associated with N participants in a
multi-participant event.

Like reference numerals refer to corresponding parts
throughout the several views of the drawings.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates a client/server computer apparatus 10
incorporating the technology of the present invention. The
apparatus 10 includes a set of client computers 22 which are
each linked to a transmission channel 84. The transmission
channel 84 generically refers to any wire or wireless link
between computers. The client computers 22 use
transmission channel 84 to communicate with each other in a
multi-participant event. The multi-participant event could be
regulated by a server computer 24 or other server computers
designated by server computer 24.

Each client computer 22 has a standard computer
configuration including a central processing unit (CPU) 30,
network interface 34, and memory 32. Memory 32 stores a
set of executable programs. Client computer 22 also
includes input/output device 36. Input/output device 36 may
include a microphone, a keyboard, a mouse, a display 38,
and/or one or more speakers. In one embodiment, the
microphone is PC 99 compliant with a close speaking
headset design having a full scale output voltage of 100 mV.
Further, in one embodiment, the microphone has a frequency
response of ±5 dB from 100 Hz to 10 kHz, ±3 dB from 300
Hz to 5 kHz and 0 dB at 1 kHz. The microphone has been
implemented with a minimum sensitivity of -44 dB relative 
to 1 V/Pa. CPU 30, memory 32, network interface 34 and 
input/output device 36 are connected by bus 68. The execut- 
able programs in memory 32 include operating system 40, 
an application module 44 for providing a user interface to 
the multi-participant event, a participant data structure 46 for 
storing information about each participant in a multi- 
participant event, and a sound control module 48. Sound 
control module 48 receives sound from remote participants 
through network interface 34 and transmits sound from the 
local participant, which is associated with client 22, to 
remote participants across transmission channel 84. Memory
32 also includes sound mixer 66 for combining the sound of
each participant in the multi-participant event into a single 
signal that is sent to input/output device 36. In a preferred 
embodiment, operating system 40 is capable of supporting 
multiple concurrent processes or threads and includes sound 
mixer 66. In an even more preferred embodiment, operating 
system 40 is a WIN32 environment or an environment that 
provides functionality equivalent to WIN32. 

FIG. 1 illustrates that each client 22 is associated with a 
local participant in the multi-participant event. The local 
participant uses input/output device 36 to communicate to 
remote participants in the multi-participant event via trans- 
mission channel 84. Sound control module 48 has instruc- 
tions for routing sound from the local participant to the 
remote participants and for receiving sound from remote 
participants. To receive sound from remote participants, 
sound control module 48 includes a plurality of receive 
sound buffers 50. In a preferred embodiment, one of the 
receive sound buffers is an overflow buffer 54 and each of 
the remaining receive sound buffers is a channel buffer 52. 
In a preferred embodiment, receive sound buffers 50 comprise
four channel buffers 52 and one overflow buffer 54.
Sound control module 48 further includes a packet controller 
56 for determining the participant associated with a packet 
of sound received from a remote participant and for routing 
the packet to the appropriate receive sound buffer 50. In 
addition, sound control module 48 includes a visual identi- 
fier module 60 that determines which participants in a 
multi-participant event are speaking. 
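
The routing rules (i) through (iii) stated in the Summary can be sketched as follows. This is only an illustrative sketch: the class name ReceiveBuffers, the method names, and the representation of a packet as a (source_id, payload) pair are assumptions made for the example and do not appear in the disclosure.

```python
from collections import deque

NUM_CHANNELS = 4  # the preferred embodiment uses four channel buffers 52


class ReceiveBuffers:
    """Hypothetical model of receive sound buffers 50."""

    def __init__(self, num_channels=NUM_CHANNELS):
        # Each channel buffer is a FIFO queue of (source_id, payload) packets.
        self.channels = [deque() for _ in range(num_channels)]
        self.overflow = deque()

    @staticmethod
    def owner(channel):
        # A channel takes its identity from the packets it currently stores;
        # an empty channel has no owner and counts as "available".
        return channel[0][0] if channel else None

    def route(self, source_id, payload):
        """Return the index of the channel used, or -1 for the overflow buffer."""
        packet = (source_id, payload)
        # Rule (i): a channel already holds packets from this participant.
        for i, channel in enumerate(self.channels):
            if self.owner(channel) == source_id:
                channel.append(packet)
                return i
        # Rule (ii): no match, so take the first available (empty) channel.
        for i, channel in enumerate(self.channels):
            if not channel:
                channel.append(packet)
                return i
        # Rule (iii): no match and no available channel.
        self.overflow.append(packet)
        return -1
```

In this sketch a channel becomes available again as soon as its queue is drained, which follows the statement that a channel buffer storing no packet is not associated with any participant.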

Sound from the local participant is stored in a transmit 
sound buffer 62 and routed to the appropriate destination by
transmit router 64. Transmit router 64 breaks the signal in 
transmit sound buffer 62 into packets and places the appro- 
priate header in each packet. Typically, the header includes 
routing information that will cause the packet to be sent to 
server 24 via transmission channel 84. Server 24 will then 
route the packet to all participants in the multi-participant 
event. However, in some embodiments, transmit router 64 
may direct the packets to other clients 22 directly instead of 
through server 24. 

Server 24 in system 10 includes a network interface 70 for 
receiving sound from clients 22 and for directing the sound 
to each client 22 that is participating in a multi-participant 
event. Server 24 further includes CPU 72 and memory 76. 
Network interface 70, CPU 72 and memory 76 are con- 
nected by bus 74. In a typical server 24, memory 76 includes 
one or more server applications 78 for tracking multi- 
participant events hosted by the server. Memory 76 further 
includes the profile of each user that has the privilege of 
using server 24 to participate in multi-participant events. 
These profiles are stored as user data 80. An identity of each 
participant in a multi-participant event hosted by server 24 
is stored in memory 76 as participant data 82. 
The general architecture and processing associated with
the invention have now been disclosed. Attention presently
turns to a more detailed consideration of the architecture of 
the invention, the processing of the invention, the distinc- 
tions between these elements and corresponding elements in 
the prior art, and advantages associated with the disclosed 
technology. 

FIG. 2 provides a detailed view of participant data struc- 
ture 46 that is used in one embodiment of the present 
invention. Data structure 46 includes a record 202 for each 
participant in a multi-participant event. Each record 202
includes a participant source identifier 204. In one embodi- 
ment of the present invention, participant source identifier 
204 does not provide information that identifies the actual 
(true) identity of a participant in the multi-participant event. 
In such embodiments, participant source identifier 204 does 
not include the IP address or name of the remote participant. 
For example, server application 78 (FIG. 1) may assign each 
participant a random number when the participant joins a 
multi-participant event. This random number is transiently 
assigned for the duration of the multi-participant event and 
cannot be traced to the participant. In other embodiments, 
participant source identifier 204 is the true identity of the 
participant or is a screen name of the participant. In such 
embodiments, visual ID 206 (FIG. 2) identifies the associated
participant. Thus, each participant is aware of exactly who 
is included in the multi-participant event. 

In one embodiment, visual ID 206 is an icon that repre- 
sents the participant. Visual ID 206 may be randomly chosen 
from a library of icons by sound control module 48 when a 
participant joins the multi-participant event. Automatic 
assignment of a visual ID 206 to each participant has the 
advantage of preserving participant anonymity. 
Alternatively, visual ID 206 is selected by the participant or 
uniquely identifies the participant by, for example, including 
the participant's actual name, screen name, and/or a picture 
of the participant. 

In complex applications, a local participant may engage in 
several concurrent multi-participant events using client 22.
Each multi-participant event will be assigned a window by 
application module 44 (FIG. 1). In such embodiments, it is 
necessary to provide a visual ID window field 208 in record
202 to indicate which window visual ID 206 is located in
and/or which multi-participant event visual ID 206 is
associated with. Each record 202 further includes the position
that visual ID 206 occupies in visual ID window 208. In
embodiments that do not support concurrent multi- 
participant events on a single client 22, visual ID window 
field 208 is not required. In such embodiments, visual ID 
position 210 represents the position of visual ID 206 in the
window assigned to application module 44 by operating
system 40 (FIG. 1).
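
One way to picture record 202 and participant data structure 46 is sketched below. The field names, the field types, and the in_window helper are assumptions chosen for illustration; the disclosure names the fields only by reference numeral. The state field stands in for visual ID state 212, which the description takes up next.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ParticipantRecord:
    """Hypothetical rendering of record 202."""
    source_id: int                      # participant source identifier 204
    visual_id: str                      # visual ID 206, e.g. an icon name
    window: Optional[int] = None        # visual ID window 208; None if unused
    position: Tuple[int, int] = (0, 0)  # visual ID position 210
    state: int = 0                      # visual ID state 212


class ParticipantData:
    """Hypothetical rendering of participant data structure 46."""

    def __init__(self) -> None:
        self.records: Dict[int, ParticipantRecord] = {}

    def add(self, record: ParticipantRecord) -> None:
        # Records are keyed by source identifier so that a packet's identity
        # can be mapped directly to the participant's record.
        self.records[record.source_id] = record

    def in_window(self, window: int) -> List[ParticipantRecord]:
        # Records whose visual ID is shown in the given event window.
        return [r for r in self.records.values() if r.window == window]
```

Keying the records by source identifier is a design choice of the sketch; it keeps the lookup independent of whether the identifier is a transient random number or a screen name.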

One advantage of the present system over the prior art is 
the use of visual ID state 212, which represents a state of 
participant 204. In some embodiments, visual ID state 212 
is a single bit. When participant 204 is speaking the bit is set 
and when participant 204 is not speaking the bit is not set. 
In other embodiments, visual ID state 212 is selected from 
a spectrum of values. The low end of this spectrum indicates 
that participant 204 is not speaking and the high end of the 
spectrum indicates that participant 204 is speaking much
louder than normal. Regardless of how visual ID state 212
is configured, it is used to modify the appearance of visual 
ID 206 on input/output device 36. For example, in embodi- 
ments where visual ID state 212 is a value selected from a 
spectrum, the value stored by visual ID state 212 may be 
used to determine visual ID 206 brightness. As an 
illustration, when visual ID 206 is a two-dimensional array
of pixel values, each pixel value in the array may be adjusted
by a constant determined by the value of visual ID state 212.
Thus, visual ID states 212 at the high end of the spectrum 
alter the average pixel value in the array by a larger constant 
than visual ID states 212 in the low end of the spectrum. In 
an alternative embodiment, visual ID state 212 is used to 
shift the color of each pixel in visual ID 206. As an 
illustration, when visual ID state 212 is at the high end of the 
spectrum, visual ID 206 is red-shifted and when visual ID 
state 212 is at the low end of the spectrum, visual ID 206 is 
green-shifted. In yet another embodiment, the particular 
visual ID 206 is selected from a library of visual IDs to 
represent participant 202 based on a function of the value of 
visual ID state 212. For instance, a particularly animated 
visual ID 206 may be selected to represent participant 202 
when the participant is speaking loudly. 
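
The brightness embodiment, in which each pixel value is adjusted by a constant determined by visual ID state 212, might look like the following sketch. The 0 to 255 grayscale range, the state range of 0 to max_state, and the linear mapping from state to offset are all assumptions made for the example.

```python
def apply_state(pixels, state, max_state=10, max_offset=100):
    """Brighten a 2-D array of 0..255 gray values by a state-dependent constant.

    A state of 0 (not speaking) leaves the icon unchanged; a state of
    max_state (much louder than normal) adds max_offset to every pixel.
    """
    if not 0 <= state <= max_state:
        raise ValueError("state out of range")
    # Larger states map to a larger constant, as in the described embodiment.
    offset = max_offset * state // max_state
    # Clamp so that bright pixels saturate instead of wrapping around.
    return [[min(255, p + offset) for p in row] for row in pixels]
```

For example, apply_state([[0, 200]], 5) adds 50 to each pixel and returns [[50, 250]].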

In yet another embodiment, visual state 212 includes 
information about the participant, such as whether the par- 
ticipant is speaking, has certain privileges, has placed the 
event on hold, or is away from the keyboard. In one 
embodiment, visual state 212 specifies whether the participant
has moderation privileges, such as the privilege to
invite potential participants to the multi-participant event, 
the privilege to remove people from the multi-participant 
event, or the privilege to grant and deny speaking privileges 
to other participants in the multi-participant event. In some 
embodiments, a participant may place an event on hold in 
the classic sense that the event is muted on client 22 and 
sound is not communicated from or to the client 22 associ- 
ated with the participant. Visual state 212 may be used to 
reflect the fact that the participant is on hold. In some 
embodiments, the participant may also have the option to 
designate that he is "away from the keyboard." This state is 
used to inform other participants that the participant is not 
within earshot of the multi-participant event but will be back 
momentarily. It will be appreciated that any information 
included in visual state 212 may be used to update the 
appearance of the visual ID 206 associated with the partici- 
pant. 

Referring to FIG. 3, detailed steps that describe how
sound from a local participant is processed by sound control
module 48 are shown. In step 302, sound control module 48
monitors a microphone 86 (FIG. 1) for sound. In the 
embodiment depicted in FIG. 3, when the amplitude of 
sound detected by microphone 36 exceeds a base state 
(302-Yes), processing step 304 is triggered. In processing 
step 304, the signal is stored in transmit sound buffer 62, 
packaged into a packet and routed to remote participants of 
the multi-participant event by transmit router 64 (FIG. 1). In
one embodiment, transmit router 64 will forward the packet 
to server 24 and server 24 will route the packet to remote 
participants based on participant data 82 information stored 
in memory 76 (FIG. 1). Such an embodiment is capable of 
preserving the anonymity of each participant in the multi- 
participant event. In an alternative embodiment, the identity 
of the participant is included in the packet and participant 
anonymity is precluded. In processing step 306, the visual 
ID 206 (FIG. 2) of the local participant is updated. If the 
visual ID 206 (FIG. 2) corresponding to the local participant is
not in state 1 (306-No), processing step 308 is executed to 
set visual ID 206 to state "1". Processing steps 306 and 308 
represent an embodiment in which visual ID state 212 is set to "1"
when the local participant is speaking and "2" when the local 
participant is not speaking. One of skill in the art will 
appreciate that visual ID state 212 could be assigned any 
number in a spectrum and this number could be used to 
adjust the appearance of visual ID 206. Further, it will be
appreciated that processing steps 306 and 308 or their 
equivalents could be executed prior to processing step 304 
and that the updated value of visual ID state 212 could be 
packaged into the sound packet that is transmitted in pro- 
cessing step 304. Other clients 22 could then use the value 
of visual ID state 212 to update the appearance of the 
associated visual ID 206. Once processing steps 304 through
308 have been executed, the process repeats itself by
returning to processing step 302.

Processing step 302 is a continuous input function. Micro- 
phone 36 constantly detects sound and system 10 is capable 
of storing sound in transmit sound buffer 62 during the
execution of processing steps 304 through 318. In a WIN32
environment, step 302 may be implemented using Microsoft 
Windows DirectSoundCapture. Techniques well known in 
the art may be used to control the capture of sound from 
microphone 36 so that the speech from the local participant 
is stored in transmit sound buffer 62 and periods of silence 
or lengthy pauses in the speech of the local participant are 
not stored in the buffer. In one embodiment, "frames" of 
sound from microphone 36 are acoustically analyzed. Pref- 
erably each frame of sound has a length of 500 milliseconds 
or less. More preferably each frame of sound has a length of 
250 milliseconds or less and more preferably 100 millisec- 
onds or less. In an even more preferred embodiment, each 
frame has a length of about ten milliseconds. Typically, the 
acoustic analysis comprises deriving parameters such as 
energy, zero-crossing rates, auto-correlation coefficients, linear
predictive coefficients, PARCOR coefficients, LPC
cepstrum, mel-cepstrum, Karhunen-Loeve transform (KLT) 
parameters, maximum likelihood ratio (MLR) criteria and/or 
spectra from each frame and using these parameters to 
determine whether the input frame is speech or noise. In 
some embodiments, speech patterns of the local participant 
may be stored by client 22 and matched with the input 
frames in order to predict whether any given frame is noise 
(302-No) or speech (302-Yes). In alternative embodiments, 
only the energy value of a particular frame is used to 
determine whether the frame is speech or noise. In such 
embodiments, the calculated energy level of a frame is 
compared to a threshold energy value. When the energy 
value of the frame exceeds the base state (302-Yes) the 
frame is stored in transmit sound buffer 62. In some 
embodiments, the value used for the base state (threshold 
value) is updated continuously based on parameters derived 
from an acoustic analysis of the content of transmit sound 
buffer 62. 
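
The energy-only embodiment can be sketched as below. The names frame_energy and EnergyDetector are hypothetical, and the adaptation rule used here (an exponential moving average of the energy of frames classified as noise, scaled by a fixed margin) is an assumption standing in for the continuous threshold update mentioned above; it is not the update procedure of the disclosure.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in samples) / len(samples)


class EnergyDetector:
    """Classify frames as speech or noise by comparing energy to a threshold."""

    def __init__(self, initial_noise=100.0, margin=4.0, smoothing=0.9):
        self.noise = initial_noise  # running estimate of background energy
        self.margin = margin        # speech must exceed noise by this factor
        self.smoothing = smoothing  # weight given to the previous estimate

    @property
    def threshold(self):
        # The "base state" against which each frame's energy is compared.
        return self.noise * self.margin

    def is_speech(self, samples):
        energy = frame_energy(samples)
        if energy > self.threshold:
            return True  # corresponds to 302-Yes
        # Noise frame (302-No): fold its energy into the background estimate
        # so the threshold tracks slow changes in the environment.
        self.noise = (self.smoothing * self.noise
                      + (1 - self.smoothing) * energy)
        return False
```

Updating the estimate only on noise frames keeps loud speech from raising the threshold, at the cost of adapting slowly when the background suddenly becomes louder.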

It will further be appreciated that noise suppression tech- 
niques may be implemented to improve step 302 and to 
suppress background noise from speech stored in transmit 
sound buffer 62. For example, input device 36 may be a 
noise canceling microphone. Additionally, input device 36 
may comprise two microphones. The two microphones may 
be used as a phased array to give enhanced response in the 
direction of the local participant. Alternatively, one micro- 
phone (the "speech microphone") collects the speech of the 
local participant plus background noise and the other micro- 
phone (the "reference microphone") is positioned so that it 
collects only background noise. Although the noise wave- 
form of the speech and reference microphones may be 
different, filters such as a finite-impulse-response ("FIR") 
filter can be used to match the noise waveform at the speech 
microphone and the noise waveform at the reference micro- 
phone. Once a match is found in the two noise waveforms, 
input from the speech microphone may be filtered. 

When it is determined that the local participant has 
stopped speaking for a period of time (302-No), processing 
step 310 is executed to check the status of the visual ID 206 
associated with the local participant. When visual ID 206 is 
not in state 2, a counter is incremented (step 312). If the 
counter exceeds a threshold count after processing step 312 
(314-Yes), visual ID 206 is set to state 2 (316) and the 
counter is reset to zero (318). Then the process repeats. 
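
Steps 310 through 318 amount to a silence counter that delays the return to state 2. A minimal sketch follows; it assumes that a speech frame sets the visual ID to state 1 and restarts the count, which the passage above does not state, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

SPEAKING, NOT_SPEAKING = 1, 2  # visual ID states "1" and "2"


@dataclass
class LocalSpeakerState:
    threshold_count: int          # hypothetical name for the threshold of step 314
    state: int = NOT_SPEAKING
    counter: int = 0

    def on_frame(self, frame_is_speech: bool) -> int:
        if frame_is_speech:                          # 302-Yes
            self.state = SPEAKING                    # assumption: speech sets state 1
            self.counter = 0                         # assumption: speech restarts the count
        elif self.state != NOT_SPEAKING:             # 302-No, step 310
            self.counter += 1                        # step 312
            if self.counter > self.threshold_count:  # 314-Yes
                self.state = NOT_SPEAKING            # step 316
                self.counter = 0                     # step 318
        return self.state
```

The counter keeps the visual ID from flickering during the short pauses that occur in normal speech.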

FIG. 4 provides a detailed illustration of how sound 
control module 48 handles sound from multiple remote 
participants. As shown in FIG. 4, packets that contain sound 
from remote participants are received by network interface 
34 from transmission channel 84. The packets are then 
routed by packet controller 56. Packet controller 56 examines 
each packet and determines the source (identity) of the 
packet. In one embodiment, each participant in the 
multi-participant event is assigned a random number by server 24 
for the duration of the multi-participant event. This random 
number is included in each packet transmitted by the 
participant and serves as the basis for the determination of 
packet identity by packet controller 56. As an illustration, 
consider an event that includes a first remote participant and 
a second remote participant. When the first remote 
participant joins the event, server application 78 assigns the 
participant the number "12". When the second remote 
participant joins the event, server application 78 assigns the 
participant the number "81". Thereafter, for the duration of 
the event, each packet originating from the first participant 
will include the number "12" and each packet from the 
second participant will include the number "81". When 
packets from the first and second remote participants are 
routed to client 22, packet controller 56 will identify and 
route the packets based on the presence of the number "12" 
or "81" in each packet. 

As shown in FIG. 4, in one embodiment of the present 
invention, sound control module 48 includes four channel 
buffers 52 and an overflow buffer 54. Preferably, each 
channel buffer is capable of storing any number of packets 
from a particular participant in the multi-participant event. 
How packet controller 56 routes packets to channel buffers 52 
or overflow buffer 54 is better understood after a preferred 
embodiment of channel buffer 52 has been described. FIG. 5 
describes such an embodiment. In FIG. 5, channel buffer 52 
is a queue. Queues provide a structure that allows for the 
addition of packets at one end and the removal of packets at 
the other end. In this manner, the first packet placed in the 
queue is the first to be removed, thus approximating the 
common activity of waiting in line. Such a procedure is 
known to those of skill in the art as first in first out 
("FIFO"). Packets 502 are placed onto a channel buffer 52 by 
enqueuing the packet onto channel buffer 52. The most 
recently enqueued packet is known as the tail of channel 
buffer 52. Packets are removed from channel buffer 52 by the 
process of dequeuing. The packet that is the next to be 
dequeued is known as the head of the buffer. 
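
The queue behavior of FIG. 5 can be sketched with a double-ended queue. This is an illustration only; the `Packet` fields and the `ChannelBuffer` class are hypothetical names, and the disclosure does not prescribe this data structure.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    source_id: int   # number identifying the participant
    payload: bytes   # digitized sound


class ChannelBuffer:
    """FIFO queue: packets enter at the tail and leave from the head."""

    def __init__(self):
        self._packets = deque()

    def enqueue(self, packet):
        self._packets.append(packet)      # the new packet becomes the tail

    def dequeue(self):
        return self._packets.popleft()    # the head is the next packet removed

    @property
    def head(self):
        return self._packets[0] if self._packets else None

    @property
    def tail(self):
        return self._packets[-1] if self._packets else None

    def __len__(self):
        return len(self._packets)
```

Because the queue grows as needed, the sketch places no limit on the number of packets that a buffer can hold.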

In one embodiment, channel buffer 52 is a linked list. 
However, one of skill in the art will appreciate that a number 
of different alternative methods may be used to implement 
a channel buffer. For example, in some embodiments, channel 
buffers 52 may be supported directly by operating system 
40. In other embodiments, channel buffers 52 may be 
implemented directly in client 22 hardware. In preferred 
embodiments, there is no limitation in the number of packets 
each channel buffer 52 is capable of storing. However, at any 
given instant, each channel buffer 52 is reserved by packet 
controller 56 in order to process packets from a particular 
participant 202 (FIG. 2) in the multi-participant event. Therefore, 
although there is no limitation imposed by packet controller 
56 on the size of a particular channel buffer 52, each channel 
buffer 52 will only contain packets from a single participant 
202 (FIG. 2) in the multi-participant event at any given 
instant. 

In contrast to channel buffers 52, overflow buffer 54 may 
store packets from several different participants. The packets 
from each participant in the multi-participant event repre- 
sented in overflow buffer 54 will be processed by packet 
controller 56 in a FIFO manner as described in more detail 
below. Importantly, the packets originating from each par- 
ticipant represented in overflow buffer 54 will be handled in 
a FIFO manner irrespective of the presence of packets from 
other participants that may be present in overflow buffer 54. 
As an illustration, if packets associated with participants A 
and B are both present in overflow buffer 54, the packets 
associated with participant A will be processed as a first 
FIFO and the packets from participant B will be processed 
as a second FIFO. Further, the first and second FIFOs will 
not be dependent on each other. 
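
The independence of the per-participant FIFOs in overflow buffer 54 can be sketched as follows. The sketch assumes a single arrival-ordered queue from which one participant's packets can be withdrawn without disturbing the others; the class and method names are hypothetical.

```python
from collections import deque


class OverflowBuffer:
    """Holds (source_id, payload) packets from several participants."""

    def __init__(self):
        self._packets = deque()   # arrival order across all participants

    def enqueue(self, source_id, payload):
        self._packets.append((source_id, payload))

    def head_source(self):
        """Identity of the participant whose packet is at the head, if any."""
        return self._packets[0][0] if self._packets else None

    def take_all_from(self, source_id):
        """Remove and return one participant's packets, oldest first.

        Packets from other participants keep their relative order.
        """
        taken = [p for p in self._packets if p[0] == source_id]
        self._packets = deque(p for p in self._packets if p[0] != source_id)
        return taken
```

Withdrawing the packets of participant A leaves the packets of participant B in their original order, so each participant is served as a separate FIFO.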

Now that the structure of illustrative channel buffers 52 
and overflow buffer 54 has been described in more detail, 
attention returns to FIG. 4. In a first independent process, 
packet controller 56 determines whether overflow buffer 54 
is empty. If overflow buffer 54 is not empty and there is an 
available channel buffer 52, the packet at the head of 
overflow buffer 54 ("overflow head packet") will be routed 
to the tail of the first available channel buffer 52. Further, 
any packets that are associated with the same participant as 
the overflow head packet will also be routed in a FIFO 
manner from overflow buffer 54 to the same available 
channel buffer 52 to which the overflow head packet was 
routed. For example, if the overflow head packet was 
associated with participant A, then any remaining packets in 
overflow buffer 54 that are associated with participant A will 
be routed to the same available channel buffer 52 to which 
the overflow head packet was routed. Further, the order in 
which such packets are routed to the available channel buffer 
from overflow buffer 54 will be based upon the order in 
which the packets were enqueued onto the overflow buffer 
(i.e., FIFO). 
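
One pass of the first independent process can be sketched as a single function. The sketch assumes that an available channel buffer is an empty one, as recited in the claims, and represents each packet as a `(source_id, payload)` tuple; the function name is hypothetical.

```python
from collections import deque


def drain_overflow(overflow, channels):
    """Move the overflow head packet, and every other packet from the same
    participant, to the first available (empty) channel buffer.

    Returns the index of the channel buffer used, or None if nothing moved.
    """
    if not overflow:
        return None
    for index, channel in enumerate(channels):
        if not channel:                      # empty channel buffer is available
            head_source = overflow[0][0]
            remaining = deque()
            while overflow:
                packet = overflow.popleft()  # visit packets in arrival order
                if packet[0] == head_source:
                    channel.append(packet)   # FIFO order is preserved
                else:
                    remaining.append(packet)
            overflow.extend(remaining)       # other participants keep their order
            return index
    return None
```

For example, with packets A1, B1, A2 in the overflow buffer and one empty channel buffer, the call moves A1 and A2 to that channel buffer and leaves B1 at the head of the overflow buffer.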

In a second independent process, packet controller 56 
collects packets from network interface 34. For each packet 
received from network interface 34, packet controller 56 
determines the identity of the participant that is associated 
with the packet. In embodiments in which the true identities 
of the participants in the multi-participant event are not 
disclosed, the distinguishing information that the packet 
controller obtains from each packet is typically the random 
number that was assigned by server application 78 to the 
participant when the participant joined the multi-participant 
event (FIG. 1). Once the identity of a packet is determined, 
packet controller 56 routes the packet using the following 
two rules: 

Rule 1. Packet controller 56 compares the identity of the 
packet received from network interface 34 to the identity of 
the packet at the tail of each channel buffer 52. If there is a 
match, packet controller 56 routes the packet to this match- 
ing channel buffer regardless of the number of packets that 
are already on the buffer. When packet controller 56 routes 
the packet to the matching channel buffer 52, the packet is 
enqueued onto the buffer 52 and the routed packet therefore 
becomes the tail of the buffer. At a programming level, 
enqueuing may involve the addition of the routed packet to 
a linked list that represents channel buffer 52. It will be 
appreciated that in a preferred embodiment packet controller 
56 keeps track of the identity of the participant that is 
assigned to each of the channel buffers and that there is no 
need to actually query the tail of channel buffer 52. 
Preferably, there are four channel buffers in sound control 
module 48 and packet controller 56 tracks which participant 
is assigned to each of the four buffers. 

Rule 2. When packet controller 56 cannot find a match 
between the identity of the received packet and the identity 
associated with the packet at the tail of each channel buffer 
52, the packet is routed to overflow buffer 54. For example, 
if there are five participants speaking simultaneously in a 
multi-participant event and there are four channel buffers 52 
in sound control module 48, packets originating from the 
fifth participant will be routed into the overflow buffer 54 
until a channel buffer 52 becomes available. It will be 
appreciated that, in this example, it is possible that some 
packets from the fifth participant will be routed to overflow 
buffer 54 even though the packets from the fifth participant 
were received by packet controller 56 before some of the 
packets from the first four participants were received. This 
situation arises because, when packet controller 56 routes a 
packet to a channel buffer 52, the packet controller reserves 
the channel buffer 52 for packets that have the same identity 
as the routed packet until the channel buffer 52 is emptied. 

The details of the two independent processes that are 
executed by sound control module 48 have now been 
disclosed. The first process describes the method by which 
overflow buffer 54 is dequeued. The second process 
describes the rules used by packet controller 56 to route 
packets received from network interface 34. In one 
embodiment these two processes are run as two independent 
threads. In embodiments where multiple independent 
threads or processes are used to execute the functions of 
sound control module 48, channel buffers 52 and overflow 
buffer 54 are placed in a virtual memory that is visible to all 
of the threads or processes. There are several techniques 
known to those of skill in the art for providing virtual 
memory that is visible to multiple processes at once. For 
example, the memory can be shared directly. When 
operating system 40 is a Microsoft Windows operating system, 
WIN32 application programming interface ("API") 
functions can be used to define virtual memory that is accessible 
to multiple independent threads. WIN32 functions for 
managing virtual memory and the storage that backs it ("memory 
mapped files") such as CreateFileMapping and 
MapViewOfFile may be used to make channel buffers 52 and 
overflow buffer 54 visible to multiple independent threads. 

When sound control module 48 is implemented in a 
WIN32 embodiment, it will be appreciated that WIN32 
system calls can be used to implement sound control module 
48. Separate threads can enqueue and dequeue channel 
buffers 52 and overflow buffer 54. The channel buffers 52 
and overflow buffer 54 can be implemented as one-way data 
channels such as unnamed (anonymous) pipes or named 
pipes. Pipes provide a suitable device for passing data 
between threads. Using a pipe, packets may pass through the 
pipe in a first-in, first-out manner, like a queue. All thread 
synchronization between the input and output ends of the 
pipe may be handled automatically by operating system 40. 
In the WIN32 environment, a pipe may be created using 
CreatePipe. Processes having the appropriate privileges may 
put packets on the pipe (enqueue the pipe) using the 
WriteFile system call and remove packets from the pipe (dequeue 
the pipe) using ReadFile. In still other preferred 
embodiments, channel buffers 52 and overflow buffer 54 
may be implemented using Microsoft DirectSound and more 
specifically IDirectSoundBuffer and/or 
IDirectSound3DBuffer. In such embodiments, a buffer could 
be defined by the DirectX API CreateSoundBuffer. 

Referring to FIG. 4, visual identification module 60 
determines which participants in a multi-participant event 
are speaking and updates the visual ID state 212 of 
participant data structure 46 (FIGS. 1 & 2). In one embodiment, 
visual identification module 60 is a component of packet 
controller 56. When packet controller 56 reserves a channel 
buffer 52 for a participant (FIG. 4), visual ID state 212 of the 
participant 202 (FIG. 2) is updated. In another embodiment, 
visual identification module 60 is an independent thread or 
process that periodically queries packet controller 56 to 
determine which participants are speaking using standard 
interprocess communication (IPC) protocols. In one 
embodiment, visual identification module 60 queries packet 
controller 56 every one hundred milliseconds and updates 
participant data structure 46 and/or output device 38. 

Sound mixer 66 dequeues the head packet of each channel 
buffer 52 in sound control module 48 and mixes the signal 
using techniques well known in the art. For example, when 
a suitable WIN32 environment that includes suitable 
DirectSound components or their equivalents is present in 
operating system 40, sound mixer 66 may be supported directly 
at the operating system level. The digitized sound produced 
by sound mixer 66 is converted to analog signals using a 
digital-to-analog converter and presented to various 
input/output devices 402 (36, FIG. 1) such as a headphone jack 
and/or one or more speakers. 

Now that sound control module 48 has been described in 
detail, a number of advantages of the present 
system will be apparent to those of skill in the art. A 
participant in a multi-participant event can easily keep track 
of who is talking even in cases where multiple participants 
are talking. Further, it is possible to visually track the 
emotional state of each participant by providing graduated 
visual icon IDs 206 based upon visual ID state 212. An 
additional advantage of the present invention is that in some 
embodiments each participant is represented on the display 
38 of client 22 (FIG. 1) without revealing the true identity, 
such as the name or source IP address, of the participant. 
Further, the determination of whether each participant in the 
multi-participant event is speaking may be made even at 
times when the local participant is speaking because no 
information is lost while the local participant is speaking. 

Referring to FIG. 6, detailed processing steps are shown 
for how visual identification module 60 may update the 
visual ID 206 (FIG. 2) of each participant in a 
multi-participant event. In processing step 602, the visual ID state 
212 of each participant in data structure 46 (FIG. 2) is set to 
state "2" indicating that the associated participant is not 
speaking. Then, packet controller 56 is queried to determine 
whether a channel buffer 52 is empty. In the embodiment 
illustrated in FIG. 6, channel buffers 52 are queried in a 
sequential order from 1 to N where N represents the number 
of channel buffers 52 in sound control module 48. However, 
in practice, no linear sampling of sequential channel buffers 
52 is necessary as long as each channel buffer 52 is sampled 
on a periodic basis. When channel buffer i is empty 
(604-Yes), i is advanced by one (620). If i is less than N (622-No), 
the process returns to processing step 604. When channel 
buffer i is not empty (604-No), packet controller 56 is queried 
for the identity of the participant that packet controller 56 
has reserved for channel buffer i (step 606). Then, the visual 
ID 206 of the participant identified in step 606 is set to "1" 
(608). When i has surpassed the number of channel buffers 
52 present in sound control module 48 (622-Yes), the visual 
ID 206 of each participant listed in participant data structure 
46 is updated on display 38 based upon the updated visual 
ID state 212 values (624). 
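
The reservations queried in step 606 are created by the routing rules of packet controller 56. A minimal sketch of Rule 1, Rule 2 and the reservation behavior follows. It assumes, as the claims recite, that an unmatched packet takes an available channel buffer when one exists, and it adds a guard (an assumption of this sketch) so that a participant already waiting in the overflow buffer is not reordered. The class and method names are hypothetical.

```python
from collections import deque


class PacketController:
    def __init__(self, channel_count=4):
        self.channels = [deque() for _ in range(channel_count)]
        self.reserved = [None] * channel_count   # participant tracked per channel
        self.overflow = deque()

    def route(self, source_id, payload):
        """Return the channel index used, or None when routed to overflow."""
        packet = (source_id, payload)
        if source_id in self.reserved:
            # Rule 1: a channel buffer reserved for this participant takes it.
            index = self.reserved.index(source_id)
        elif None in self.reserved and all(
                p[0] != source_id for p in self.overflow):
            # Assumption: an unmatched packet reserves an available channel.
            index = self.reserved.index(None)
            self.reserved[index] = source_id
        else:
            # Rule 2: no match and no usable channel, so use overflow.
            self.overflow.append(packet)
            return None
        self.channels[index].append(packet)
        return index

    def dequeue(self, index):
        """Remove the head packet; release the reservation once emptied."""
        packet = self.channels[index].popleft()
        if not self.channels[index]:
            self.reserved[index] = None
        return packet
```

With four channel buffers and five simultaneous speakers, the first four participants each reserve a channel buffer and packets from the fifth are held in the overflow buffer until a channel buffer is emptied.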
FIG. 7 provides a schematic illustration of the appearance 
of visual IDs 206 on display 38 after three different cycles 
of the process illustrated in FIG. 6. In FIG. 7a, visual 
identification module 60 determined that none of the 
participants in a multi-participant event are currently speaking. 
In such a case, each channel buffer 52 and overflow buffer 
54 is empty. FIG. 7b shows the status of display 38 after 
another cycle of the processing steps shown in FIG. 6. In 
FIG. 7b visual identification module 60 determined that 
participant 1 is speaking. Accordingly, unlike participants 2 
thru N, the visual ID state 212 of participant 1 is set to "1" 
and the visual ID 206 of the participant is hatched rather than 
blank. It will be appreciated that any number of alterations 
of visual ID 206 based upon the value of visual ID state 212 
(FIG. 2) may be chosen and that FIG. 7 merely serves to 
illustrate the concept. FIG. 7c shows the status of display 38 
after a third cycle of the processing steps of FIG. 6. In FIG. 
7c, participants 2 and 3 are now speaking and participant 1 
has stopped speaking. Accordingly, the visual IDs 206 of 
participants 2 and 3 are now hatched while all other 
participants are unhatched. 
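
The three cycles of FIG. 7 can be reproduced with a short sketch of the FIG. 6 loop. It assumes that the channel buffers and their reservations are passed in as plain lists, and it uses a hypothetical text rendering in which "#" stands for a hatched visual ID and "." for a blank one.

```python
from collections import deque

SPEAKING, NOT_SPEAKING = 1, 2   # visual ID states "1" and "2"


def refresh_visual_id_states(participants, channels, reserved):
    """One cycle of FIG. 6; returns a mapping of participant to state."""
    states = {p: NOT_SPEAKING for p in participants}   # step 602
    for i, channel in enumerate(channels):             # each channel buffer in turn
        if channel:                                    # 604-No: buffer not empty
            speaker = reserved[i]                      # step 606
            states[speaker] = SPEAKING                 # step 608
    return states                                      # step 624 would redraw display 38


def render(states):
    """Hypothetical rendering: '#' for hatched, '.' for blank."""
    return "".join("#" if states[p] == SPEAKING else "." for p in sorted(states))
```

Running the cycle with no occupied channel buffer, then with participant 1 holding a buffer, then with participants 2 and 3 holding buffers, corresponds to FIGS. 7a, 7b and 7c respectively.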

It will be appreciated that one of the advantages of the 
present invention is that each participant in a 
multi-participant event can visually identify and track exactly who 
is in the multi-participant event. The visual ID 206 of each 
participant may be displayed in a unique position of display 
38 of client 22 (FIG. 1) as shown on the left section of FIGS. 
7a thru 7c. In addition, the visual ID 206 of each participant 
may be represented in a list format 208. When visual IDs 206 
are presented in a list format, such as list format 208, 
characteristics associated with each of the participants may 
still be visually communicated. In one embodiment, the 
participants who are currently speaking are placed at the top 
of the list. Participants who are not speaking are bumped 
from the top of the list to lower positions on the list. The 
longer a participant has not spoken, therefore, the lower the 
participant will be on the list. Additional characteristics may 
be communicated using a list format such as list format 208. 
In one example, visual ID 206 is the name of the participant 
in a list, and a graphic that is descriptive of the privileges 
associated with a participant is displayed beside the 
name of the participant in list 208. In one aspect of this 
example, when the participant has moderation privileges, a 
special graphic that indicates that the participant has such 
privileges is displayed next to the name of the participant. When 
the participant is speaking louder than a base reference state, 
the color used to display the participant's name is red-shifted. 
One of skill in the art will appreciate that there are many 
other ways in which characteristics of speaking participants 
may be communicated using a list format 208. Further, it 
will be appreciated that list 208 could be used in conjunction 
with the independent display of each visual ID 206, as shown 
in the left portions of FIGS. 7a-7c, or as a substitute for such 
independent display. 
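
The list ordering described above can be sketched as a sort on the time since each participant last spoke. The sketch assumes that a timestamp of the last detected speech is kept per participant; the function name and the mapping are hypothetical.

```python
def order_participant_list(last_spoke, now):
    """Return names ordered so that the most recent speakers come first.

    last_spoke maps a name to the time the participant was last heard
    speaking, or to None if the participant has not spoken.
    """
    def silence(name):
        t = last_spoke[name]
        return float("inf") if t is None else now - t

    # sorted() is stable, so ties keep their existing relative order.
    return sorted(last_spoke, key=silence)
```

Participants who are speaking at time `now` have zero silence and appear at the top; participants who have never spoken appear at the bottom.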

The foregoing descriptions of specific embodiments of the 
present invention are presented for purposes of illustration 
and description. They are not intended to be exhaustive or to 
limit the invention to the precise forms disclosed. Obviously, 
many modifications and variations are possible in view of 
the above teachings. The embodiments were chosen and 
described in order to best explain the principles of the 
invention and its practical applications to thereby enable 
others skilled in the art to best utilize the invention and 
various embodiments with various modifications as are 
suited to the particular use contemplated. It is intended that 
the scope of the invention be defined by the following claims 
and their equivalents. All references cited are incorporated 
by reference for all purposes. 
We claim: 

1. A computer product for use in conjunction with a 
computer system, the computer program product comprising 
a computer readable storage medium and a computer pro- 
gram mechanism embedded therein, the computer program 
mechanism comprising: 

(i) a participant data structure comprising a plurality of 
participant records, each participant record associated 
with a different participant in a multi-participant event; 

(ii) an application module for providing a user interface to 
said multi-participant event; 

(iii) a sound control module for receiving a plurality of 
packets from a network connection, each said packet 
associated with a participant in said multi-participant 
event and including digitized speech from said 
participant, said sound controller comprising: 

a plurality of buffers, each buffer including instructions 
for managing a subset of said packets; 

a packet controller that includes instructions for deter- 
mining said participant associated with said packet 
and instructions for routing said packet to a buffer, 
wherein when said packet is routed to said buffer, 
said packet is managed by said buffer; and 

a visual identification module that includes instructions 
for visually identifying said participant in said multi- 
participant event and a characteristic associated with 
said participant; and 

(iv) a sound mixer that includes instructions for mixing 
digitized speech from at least one of said buffers to 
produce a signal that is presented to an output device. 

2. The computer product of claim 1 wherein said partici- 
pant record further includes a participant source identifier, a 
visual identifier associated with said participant, a position 
of said visual identifier on an output device, and a visual 
identifier state. 

3. The computer product of claim 2 wherein said visual 
identification module further includes instructions for updating 
said visual identifier state in a participant record associated 
with said participant, wherein: 

when said participant is speaking in said multi-participant 
event, said visual identifier is set to a first state; and 

when said participant is not speaking in said multi- 
participant event, said visual identifier is set to a second 
state. 

4. The computer product of claim 3 wherein said appli- 
cation module includes instructions for displaying said 
visual identifier on an output device based upon said char- 
acteristic associated with said participant. 

5. The computer product of claim 2 wherein said appli- 
cation module includes instructions for displaying said 
visual identifier on an output device, wherein: 

when said participant associated with said visual identifier 
is speaking in said multi-participant event said unique 
visual identifier is displayed in a first state; and 

when said participant associated with said visual identifier 
is not speaking in said multi-participant event said 
visual identifier is displayed in a second state. 

6. The computer product of claim 2 wherein: 

(i) said participant record further includes a reference 
speech amplitude associated with said participant; and 

(ii) said visual identification module further includes: 
instructions for determining a buffered speech ampli- 
tude based upon a characteristic of digitized speech 
in at least one packet, associated with said 
participant, that is managed by a buffer; 




instructions for computing a speech amplitude differ- 
ential based on said buffered speech amplitude and 
said reference speech amplitude; 

instructions for updating said visual identifier associ- 
ated with said participant based on said speech 
amplitude differential; and 

instructions for storing said buffered speech amplitude 
as said reference speech amplitude in said participant 
record. 

7. The computer product of claim 6 wherein said appli- 
cation module includes instructions for displaying said 
visual identifier on an output device based on a function of 
said visual identifier state in said participant record; and 

said instructions for updating said visual identifier asso- 
ciated with said participant based on said speech ampli- 
tude differential includes instructions for updating said 
visual identifier state based upon a value of said speech 
amplitude differential. 

8. The computer product of claim 1 wherein said multi- 
participant event is selected from the group consisting of an 
audio conference and an on-line game. 

9. The computer product of claim 1 wherein: 

said plurality of buffers includes an overflow buffer and a 
plurality of channel buffers, wherein: 

(i) when a packet is present in a channel buffer, said 
channel buffer is characterized by an identity of said 
packet; and 

(ii) when no packet is present in a channel buffer said 
channel buffer is available; and 

said instructions for routing said packet to said buffer 
includes instructions for comparing an identity of the 
participant associated with said packet with said identity 
that characterizes said channel buffer, wherein: 

(a) when said identity of the participant associated with 
said packet matches said identity characterizing said 
channel buffer, said packet is routed to said channel 
buffer; and 

(b) when said identity of the participant associated with 
said packet does not match said identity that char- 
acterizes a channel buffer, said packet is routed to an 
available channel buffer, and when no channel buffer 
in said plurality of channel buffers is available, said 
packet is routed to said overflow buffer. 

10. The computer product of claim 1 wherein said instruc- 
tions for managing said subset of said packets is first in first 
out. 

11. The computer product of claim 2 wherein said par- 
ticipant source identifier is a temporary unique number 
assigned to said participant for the duration of said multi- 
participant event. 

12. The computer product of claim 2 wherein said packet 
comprises a packet header and a formatted payload and said 
formatted payload includes said participant source identifier, 
a packet data size, and said digitized speech from said 
participant. 

13. The computer product of claim 9 wherein said sound 
mixer further includes: 

instructions for retrieving a portion of said digitized 
speech from a first packet in each said channel buffer; 
and 

instructions for combining each said portion of said 
digitized speech into said mixed digitized signal. 

14. The computer product of claim 13 wherein said 
portion of said digitized speech is ten milliseconds. 

15. The computer product of claim 1 wherein said char- 
acteristic associated with said participant is selected from 
the group consisting of (i) whether said participant is associated 
with a channel buffer, (ii) whether said participant has 
moderation privileges, (iii) whether said participant has 
placed said multi-participant event on hold and (iv) whether 
said participant has specified that he is away from the 
keyboard. 

16. A method for visually identifying speaking partici- 
pants in a multi-participant event, said method comprising 
the steps of: 

receiving a packet from a remote source; 

determining an identity associated with said packet; 

comparing said identity of said packet with an identity 
associated with a channel buffer selected from a plu- 
rality of channel buffers; wherein, said identity associ- 
ated with said channel buffer is determined by an 
identity of a packet stored by said channel buffer when 
said channel buffer is storing a packet, and said channel 
buffer is available when no packet is stored by said 
channel buffer; 

routing said packet to: 

(i) a channel buffer when said identity of said packet 
matches said identity associated with said channel 
buffer; 

(ii) an available channel buffer when said identity of 
said packet does not match an identity of a channel 
buffer; and 

(iii) an overflow buffer when said identity of said 
packet does not match an identity of a channel buffer 
and there is no available channel buffer; and 

associating a different visual identifier with each partici- 
pant in said multi-participant event; 

displaying each said different visual identifier on an 
output device; wherein said different visual identifier is 
determined by a characteristic associated with said 
participant. 

17. The method of claim 16 further comprising the step of: 

updating a visual identifier state in a participant record, 
said visual identifier state determined by whether an 
identity of a participant corresponding to said participant 
record matches an identity associated with a 
channel buffer. 

18. The method of claim 17 wherein said updating step 
further comprises: 

determining a difference between a characteristic of said 
packet and a reference characteristic stored in said 
participant record; and 

setting said visual identifier state based upon said differ- 
ence. 

19. The method of claim 16 wherein said multi-participant 
event includes a local participant and at least one remote 
participant, said method further comprising the steps of: 

accepting a frame of sound from an input device; 

deriving acoustic parameters from the content of said 
frame of sound; 

performing an acoustic function using said acoustic 
parameters to determine whether said frame of sound 
includes speech from said local participant; 

updating a visual identifier state in a participant record 
associated with said local participant, said visual iden- 
tifier state determined by whether said frame of sound 
includes speech from said local participant. 

20. The method of claim 16 wherein said multi-participant 
event is selected from the group consisting of an audio 
conference and an on-line game. 

21. The method of claim 16 further including the step of 
assigning a temporary number to a participant for the 
duration of said multi-participant event; wherein said temporary 
number provides an identity to said participant. 

22. The method of claim 16 further including the steps of: 

mixing sound from each channel buffer; and 

presenting said mixed sound to an output device. 

23. The method of claim 22 wherein said mixing step 
further includes the steps of: 

retrieving a portion of digitized speech from a first packet 
in each said channel buffer; and 

combining each said portion of said digitized speech into 
a mixed digitized signal. 
24. The method of claim 23 wherein said portion of said 
digitized speech is ten milliseconds. 

25. The method of claim 16 wherein said characteristic 
associated with said participant is selected from the group 
consisting of (i) whether said participant is associated with 
a channel buffer, (ii) whether said participant has moderation 
privileges, (iii) whether said participant has placed said 
multi-participant event on hold and (iv) whether said par- 
ticipant has specified that he is away from the keyboard. 